Introduction to Generative Models: Beyond Classification

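We are moving on from discriminative models, which solve classification and regression tasks by learning the conditional probability $P(y|x)$, to the more advanced territory of generative models. The central goal is now density estimation: learning the probability distribution $P(x)$ that underlies the training data $X$. This shift lets a model capture the complex dependencies and structure within high-dimensional datasets, so that it goes beyond separating classes with a decision boundary and can represent and synthesize the data itself.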

1. The Goal of Generative Models: Modeling $P(x)$

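The goal of a generative model is to estimate the probability distribution $P(x)$ from which the training data $X$ originates. A successful generative model can perform three key tasks: (1) density estimation (assigning a probability score to an input $x$), (2) sampling (generating new data points $x_{new} \sim P(x)$), and (3) unsupervised feature learning (discovering meaningful, disentangled representations in a latent space).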
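As a rough illustration of the first two tasks, the sketch below fits a one-dimensional Gaussian to toy data and then uses it for density estimation and sampling. The Gaussian family, the synthetic data, and the names `log_density` and `sample` are assumptions chosen for brevity; practical generative models use far more flexible families.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "training data" X: 1,000 draws from a distribution the model must estimate.
X = rng.normal(loc=2.0, scale=0.5, size=1000)

# Fit: for a Gaussian, the maximum-likelihood estimates are the sample mean
# and the (biased) sample standard deviation.
mu_hat = X.mean()
sigma_hat = X.std()

def log_density(x):
    """Task 1, density estimation: log of the fitted Gaussian density at x."""
    z = (x - mu_hat) / sigma_hat
    return -0.5 * z**2 - np.log(sigma_hat) - 0.5 * np.log(2.0 * np.pi)

def sample(n):
    """Task 2, sampling: draw n new points from the fitted model."""
    return rng.normal(loc=mu_hat, scale=sigma_hat, size=n)

x_new = sample(5)
print("fitted mean/std:", mu_hat, sigma_hat)
print("log-density near the data:", log_density(2.0))
print("log-density far from the data:", log_density(5.0))
print("new samples:", x_new)
```

A point close to the training data receives a much higher log-density than a point far from it, and the new samples resemble the training data without copying any single record.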

2. Taxonomy: Explicit vs. Implicit Likelihood

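Generative models are fundamentally categorized by how they approach the likelihood function. Explicit density models, such as variational autoencoders (VAEs) and flow-based models, define a mathematical likelihood function and try to maximize it (or a lower bound on it). Implicit density models, most famously generative adversarial networks (GANs), avoid computing the likelihood altogether and instead use an adversarial training framework to learn a mapping function that produces samples from the distribution $P(x)$.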
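For reference, the objectives behind these categories are commonly written as follows. These are the standard textbook forms rather than formulas from this lesson, and the symbols $\theta$, $\phi$, $z$, $q_\phi$, $f_\theta$, $p_Z$, $G$ (generator) and $D$ (discriminator) are notation introduced here.

Explicit density, exact likelihood (e.g., flow-based models with an invertible map $z = f_\theta(x)$):

$$\max_\theta \sum_{x \in X} \log p_\theta(x), \qquad \log p_\theta(x) = \log p_Z\big(f_\theta(x)\big) + \log \left| \det \frac{\partial f_\theta(x)}{\partial x} \right|$$

Explicit density, lower bound (the VAE evidence lower bound):

$$\log p_\theta(x) \ge \mathbb{E}_{q_\phi(z|x)}\big[\log p_\theta(x|z)\big] - D_{KL}\big(q_\phi(z|x) \,\|\, p(z)\big)$$

Implicit density (the original GAN minimax game, in which no likelihood appears):

$$\min_G \max_D \; \mathbb{E}_{x \sim P(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p(z)}\big[\log\big(1 - D(G(z))\big)\big]$$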

Data Synthesis and Feature Interpolation

Generative models demonstrate their capability by generating novel, high-fidelity instances (e.g., unseen faces, complex textures) or by allowing semantic interpolation in the learned latent space, illustrating the model's grasp of data variability.

Examples of AI-generated faces and interpolated features.
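Semantic interpolation can be sketched as walking along a straight line between two latent codes and decoding each intermediate point. In the minimal example below, `decode` is a hypothetical stand-in (a fixed random linear map followed by `tanh`), not a trained generator, and the dimensions are arbitrary assumptions.

```python
import numpy as np

def interpolate_latents(z_start, z_end, num_steps):
    """Return num_steps latent codes evenly spaced on the line from z_start to z_end."""
    alphas = np.linspace(0.0, 1.0, num_steps)
    return np.stack([(1.0 - a) * z_start + a * z_end for a in alphas])

rng = np.random.default_rng(0)
latent_dim, data_dim = 8, 16

# Hypothetical decoder: stands in for a trained VAE decoder or GAN generator.
W = rng.normal(size=(latent_dim, data_dim))

def decode(z):
    return np.tanh(z @ W)

z_a = rng.normal(size=latent_dim)  # latent code of a first example
z_b = rng.normal(size=latent_dim)  # latent code of a second example

path = interpolate_latents(z_a, z_b, num_steps=5)  # shape (5, latent_dim)
frames = decode(path)                              # shape (5, data_dim)
print(path.shape, frames.shape)
```

With a trained model, each row of `frames` would be one image in a gradual transition between the two endpoints. Linear interpolation is the simplest choice; spherical interpolation is a common alternative when the latent prior is Gaussian.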

Question 1

In generative modeling, what is the primary distribution of interest?

$P(x)$

$P(y|x)$

$P(x|y)$

$P(y)$

Question 2

Which type of generative model relies on adversarial training and avoids defining an explicit likelihood function?

Variational Autoencoder (VAE)

Autoregressive Model

Generative Adversarial Network (GAN)

Gaussian Mixture Model (GMM)

Challenge: Anomaly Detection

Leveraging Density Estimation

A financial institution has trained an explicit density generative model $G$ on millions of legitimate transaction records. A new transaction $x_{new}$ arrives.

Goal: Determine if $x_{new}$ is an anomaly (fraud).

Step 1

Based on the density estimate of $P(x)$, what statistical measure must be evaluated for $x_{new}$ to flag it as anomalous?

Solution:
The model must evaluate the likelihood $P(x_{new})$, that is, the density that $G$ assigns to the new transaction (in practice usually its logarithm, for numerical stability). If $P(x_{new})$ falls below a predefined threshold $\tau$, meaning the new point is statistically improbable under the learned distribution of legitimate transactions, it is flagged as an anomaly.
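The thresholding rule can be sketched as below. This is an illustration under stated assumptions, not the institution's method: the model $G$ is replaced by a multivariate Gaussian fitted to synthetic two-feature data, and $\log \tau$ is set to a low percentile of the training log-densities, which is only one possible way to choose the threshold.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for legitimate transactions: two made-up features per record.
legit = rng.normal(loc=[50.0, 1.0], scale=[10.0, 0.2], size=(5000, 2))

# Stand-in for the explicit density model G: a Gaussian fitted to the data.
mean = legit.mean(axis=0)
cov = np.cov(legit, rowvar=False)
cov_inv = np.linalg.inv(cov)
_, log_det = np.linalg.slogdet(cov)

def log_prob(x):
    """Log-density of the fitted Gaussian at each row of x."""
    diff = np.atleast_2d(x) - mean
    mahalanobis_sq = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    d = mean.shape[0]
    return -0.5 * (mahalanobis_sq + log_det + d * np.log(2.0 * np.pi))

# Assumed rule: log(tau) is the 0.1th percentile of the training log-densities,
# so roughly 0.1% of legitimate records would be flagged.
log_tau = np.percentile(log_prob(legit), 0.1)

def is_anomaly(x_new):
    """Flag x_new when log P(x_new) < log(tau)."""
    return log_prob(x_new) < log_tau

typical = np.array([50.0, 1.0])   # close to the bulk of the training data
unusual = np.array([500.0, 3.0])  # far outside the training data
print(is_anomaly(typical), is_anomaly(unusual))
```

Comparing log-densities against $\log \tau$ is equivalent to comparing $P(x_{new})$ against $\tau$ and avoids numerical underflow. The percentile controls the trade-off between false alarms and missed fraud.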